PrefixTreeESpan: A Pattern Growth Algorithm for Mining Embedded Subtrees

Authors

  • Lei Zou
  • Yansheng Lu
  • Huaming Zhang
  • Rong Hu
Abstract

Frequent embedded subtree pattern mining is an important data mining problem with broad applications. In this paper, we propose a novel embedded subtree mining algorithm, called PrefixTreeESpan (i.e., Prefix-Tree-projected Embedded-Subtree pattern), which finds a subtree pattern by growing a frequent prefix-tree. Using divide and conquer, recursively mining the local length-1 frequent subtree patterns in each prefix-tree-projected database yields the complete set of frequent patterns. Unlike Chopper and XSpanner [4], PrefixTreeESpan does not need a checking process. Our performance study shows that PrefixTreeESpan outperforms the Apriori-like algorithm TreeMiner [6] and the pattern-growth algorithms Chopper and XSpanner.
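
To make the notion of a frequent embedded subtree concrete, the following is a minimal brute-force sketch of support counting. It is not PrefixTreeESpan itself and is not taken from the paper. It assumes rooted, ordered, labeled trees and the common definition in which a pattern edge may map to any ancestor-descendant pair in the data tree while sibling order is preserved. The tuple encoding `(label, children)` and the names `flatten`, `is_embedded`, and `support` are hypothetical, chosen only for illustration.

```python
def flatten(tree):
    """Preorder-flatten a (label, children) tree.

    Returns four parallel lists indexed by preorder position: label,
    parent index, previous-sibling index (-1 if none), and the preorder
    index of the last node in that node's subtree (its scope end).
    """
    labels, parent, prev_sib, end = [], [], [], []

    def visit(node, par, prev):
        idx = len(labels)
        labels.append(node[0])
        parent.append(par)
        prev_sib.append(prev)
        end.append(idx)
        last = -1
        for child in node[1]:
            last = visit(child, idx, last)
        end[idx] = len(labels) - 1
        return idx

    visit(tree, -1, -1)
    return labels, parent, prev_sib, end


def is_embedded(pattern, data):
    """True if `pattern` occurs in `data` as an ordered embedded subtree."""
    p_labels, p_parent, p_prev, _ = flatten(pattern)
    d_labels, _, _, d_end = flatten(data)
    image = [0] * len(p_labels)  # image[i] = data node matched to pattern node i

    def place(i):
        if i == len(p_labels):
            return True
        if p_parent[i] < 0:
            lo, hi = 0, len(d_labels) - 1  # the pattern root may map anywhere
        else:
            anchor = image[p_parent[i]]
            lo, hi = anchor + 1, d_end[anchor]  # proper descendant of parent's image
            if p_prev[i] >= 0:
                # must lie strictly after the previous sibling's whole scope
                lo = max(lo, d_end[image[p_prev[i]]] + 1)
        for j in range(lo, hi + 1):
            if d_labels[j] == p_labels[i]:
                image[i] = j
                if place(i + 1):
                    return True
        return False

    return place(0)


def support(pattern, database):
    """Number of trees in `database` that contain `pattern`."""
    return sum(is_embedded(pattern, t) for t in database)


# Toy database (illustrative data only).
t1 = ("A", [("B", [("C", [])]), ("D", [])])
t2 = ("A", [("C", [])])
print(support(("A", [("C", [])]), [t1, t2]))  # A with descendant C
```

The exhaustive search here is exponential in the worst case. Pattern-growth miners such as the one described above avoid this kind of repeated matching by extending patterns inside projected databases.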

Similar resources

IMB3-Miner: Mining Induced/Embedded Subtrees by Constraining the Level of Embedding

Tree mining has recently attracted a lot of interest in areas such as Bioinformatics, XML mining, Web mining, etc. We are mainly concerned with mining frequent induced and embedded subtrees. While more interesting patterns can be obtained when mining embedded subtrees, unfortunately mining such embedding relationships can be very costly. In this paper, we propose an efficient approach to tackle...

Discovering Frequent Embedded Subtree Patterns from Large Databases of Unordered Labeled Trees

Recent years have witnessed a surge of research interest in knowledge discovery from data domains with complex structures, such as trees and graphs. In this paper, we address the problem of mining maximal frequent embedded subtrees which is motivated by such important applications as mining “hot” spots of Web sites from Web usage logs and discovering significant “deep” structures from tree-like...

Mining Induced and Embedded Subtrees in Ordered, Unordered, and Partially-Ordered Trees

Many data mining problems can be represented with nonlinear data structures like trees. In this paper, we introduce a scalable algorithm to mine partially-ordered trees. Our algorithm, POTMiner, is able to identify both induced and embedded subtrees and, as special cases, it can handle both completely ordered and completely unordered trees (i.e. the particular situations existing algorithms add...

MB3-Miner: mining eMBedded subTREEs using Tree Model Guided candidate generation

Tree mining has many useful applications in areas such as Bioinformatics, XML mining, Web mining, etc. In general, most of the formally represented information in these domains is in a tree-structured form. In this paper we focus on mining frequent embedded subtrees from databases of rooted labeled ordered subtrees. We propose a novel and unique embedding list representation that is suitable for d...

PCITMiner- Prefix-based Closed Induced Tree Miner for finding closed induced frequent subtrees

Frequent subtree mining has attracted a great deal of interest among researchers due to its application in a wide variety of domains. Some of the domains include bioinformatics, XML processing, computational linguistics, and web usage mining. Despite the advances in frequent subtree mining, mining for the entire frequent subtrees is infeasible due to the combinatorial explosion of the freq...

Publication date: 2006